INTRODUCTION


One  of  the  recommendations  from  a  series  of  Defence  Cost  Studies  recently  conducted 
in  the  UK  was  that  consideration  should  be  given  to  the  establishment  of  a  single  initial 
selection  test  for  tri-service  rating/other  rank  use.  As  well  as  providing  economies  in 
terms  of  future  test  development,  maintenance  and  validation,  this  recommendation  was 
particularly  pertinent  given  the  age  of  the  existing  Royal  Navy  (RN)  and  Royal  Air 
Force  (RAF)  batteries  and  the  considerable  cost  of  renewing  these.  The  Selection  Testing 
Working  Group  (STWG)  was  re-formed  to  investigate  the  potential  of  establishing  a 
single  tri-service  test  and  has  commissioned  a  number  of  studies  to  look  at  the 
inter-relationship  of  the  different  service  selection  batteries. 


This paper draws upon the findings from a series of studies sponsored by the STWG. It
looks at the relationship between the BARB and NPS tests and the existing UK service
selection tests. The studies provide insights into the construct validity of the BARB and
NPS batteries and the scope for rationalising initial selection testing.


THE  BARB  AND  NPS  BATTERIES 


The  British  Army  Recruit  Battery  (BARB)  is  the  British  Army's  initial  selection  test.  It 
has  been  in  service  since  1992.  Computer  based,  the  test  uses  item-generation  theory  to 
generate  and  deliver  a  unique  set  of  items  to  each  candidate.  Interaction  with  the 
computer  is  by  a  touch-sensitive  monitor.  The  battery  currently  consists  of  six  scored 
sub-tests,  five  of  which  are  simple  cognitive  tasks  that  map  onto  Carroll's  second  order 
psychometric  constructs  and  ultimately  contribute  to  the  third  order  factor  of  general 
intelligence  (Carroll,  1993).  The  sixth  test,  a  vocabulary  task,  does  not  share  the  same 
theoretical  underpinning,  but  can  be  viewed  as  mapping  onto  Carroll's  second  order 
construct  of  crystallised  intelligence.  A  composite  score  referred  to  as  the  GTI  is  the 
main  output  from  the  battery. 


The NPS battery is a pencil-and-paper experimental battery developed for
evaluation by the Royal Navy. It consists of two main parts: the ABC tests and the
numeracy & literacy tests. The ABC tests consist of five subtests which share the same
theoretical  basis  as  the  BARB  sub-tests  (four  of  the  five  tests  have  very  similar  item 
types).  The  ABC  tests  are  supplemented  by  the  numeracy  and  literacy  tests.  These  do  not 
share  the  same  conceptual  basis  as  the  ABC  tests,  but  as  with  the  BARB  vocabulary  task, 
would  appear  to  load  onto  the  crystallised  intelligence  factor.  A  composite  score  referred 
to  as  NPS  Total  is  the  main  output  from  the  battery,  although  a  separate  composite,  the 
ABC  total,  is  computed  from  the  ABC  tests.  The  sub-tests  and  the  overlap  between  the 
BARB  and  NPS  batteries  are  illustrated  in  Figure  1. 


OTHER SERVICE TEST BATTERIES

The current RN test battery is called the Recruiting Test (RT). The RAF battery is called
the Ground Trades Test Battery (GTTB). The two batteries have been in service, with
revisions, since the 1940s/1950s. The theoretical basis of the batteries can be traced to
Spearman's seminal work on the structure of the intellect (Spearman, 1927). Each of the
batteries consists of four sub-tests measuring general intelligence through Spearman's
verbal education and spatial/mechanical factors. The GTTB contains an additional two
attainment-loaded tests for technician selection; these are taken by only a proportion of
RAF applicants. The composite score from the RT is referred to as the RT Total. The
GTTB produces two composites: the GAI, formed from the four subtests measuring
general intelligence, and the GTI, formed from the GAI and the attainment-loaded
technician tests.


Figure 1. The interrelationship of the BARB and NPS test batteries

CONTRIBUTING  STUDIES 

A number of studies were commissioned by the STWG. In these studies,
applicants/entrants from each of the services sat a further test battery in addition to the
one they had taken for selection. The studies reviewed in this paper are:

• BARB vs GTTB (Kitson & Elshaw, 1996)

• NPS vs GTTB (Bailey, 1996)

• NPS vs RT (Jones, Dennis & Collis, 1995) and

• BARB vs NPS ABC (Price et al., 1996)

BARB  vs  GTTB 

In  this  study,  a  sample  of  428  army  applicants  took  the  GTTB  whilst  attending  a  Recruit 
Selection  Centre.  All  applicants  had  previously  taken  BARB  as  part  of  the  selection 
process.  The  delay  between  taking  BARB  and  GTTB  was  approximately  one  month. 
Analysis of the data produced a correlation between BARB GTI and GTTB GAI of 0.66.
When corrected for the unreliability of the two tests, a correlation of 0.77 was obtained.
Table  1  shows  the  correlation  matrix  for  the  BARB  and  GTTB  subtests. 
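
The paper does not state which correction for unreliability was applied. A common choice is Spearman's correction for attenuation, which divides the observed correlation by the square root of the product of the two tests' reliabilities. The Python sketch below assumes that formula. The function name `disattenuate` and the reliability values of 0.86 are hypothetical, chosen only for illustration; no reliabilities are reported in this paper.

```python
import math


def disattenuate(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Spearman's correction for attenuation (assumed, not confirmed by the paper)."""
    if not (0.0 < rel_x <= 1.0 and 0.0 < rel_y <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


# Hypothetical reliabilities of 0.86 for each battery, for illustration only.
corrected = disattenuate(0.66, 0.86, 0.86)
print(round(corrected, 2))
```

With these illustrative reliabilities, an observed correlation of 0.66 rises to about 0.77. Other pairs of reliabilities with the same product would give the same result.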


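GTTB subtests (rows) against BARB subtests (columns).
BARB columns, in order: Alphabet F/B, Letter Checking, Number Distance, Symbol Rotation, Synonyms/Antonyms, Transitive Inference.
Values are listed in source order; rows with fewer than six values have cells missing from the source.

G6 Reasoning              0.32  0.31  0.45  0.31  0.55  0.46
G7 Non-Verbal Reasoning   0.27  [illegible]  0.40  0.30
N7 Arithmetic             0.50  0.31  0.36  0.33
V5 Word Knowledge         0.32  0.27  0.31  0.24  0.61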

All  correlations  p  <  0.01 

Table  1.  Table  showing  BARB  and  GTTB  subtest  intercorrelations 

The correlations shown in the table range from low to fairly high. Several are
encouraging and provide evidence to support the construct validity of some of the BARB
tests (e.g., number distance with arithmetic, and synonyms/antonyms with word
knowledge). A factor analysis of the combined subtests, including the two GTTB
attainment-loaded tests, yielded a two-factor solution. The core BARB tests and three of
the GTTB core tests (excluding word knowledge) loaded onto the first factor. The second
factor comprised the BARB synonyms/antonyms test, the GTTB word knowledge test, and
the GTTB attainment-loaded tests. These results indicate that both batteries measure a
common g factor together with a second, more VEd/crystallised factor.

NPS vs GTTB

In this study, 384 RAF recruits in basic training took the NPS battery of tests. All the
recruits had previously taken the GTTB as part of their selection into the RAF. The delay
between taking the GTTB and the NPS is believed to have been up to several months. The
correlations of GTTB GAI with NPS Total and ABC Total were 0.66 and 0.60,
respectively (0.77 and 0.71 when corrected for unreliability). The correlation matrix for
the GTTB and NPS subtests is shown in Table 2.

GTTB subtests (rows) against NPS subtests (columns).
NPS columns, in order: Letter Distance, Letter Checking, Number Distance, Symbol Rotation, Reasoning, Literacy, Numeracy.
Values are listed in source order; rows with fewer than seven values have cells missing from the source.

G6 Reasoning              0.38  0.34  0.49  0.35  [illegible]  0.61
G7 Non-Verbal Reasoning   0.32  0.36  0.40  0.41  0.40
N7 Arithmetic             0.41  0.35  0.50  0.30  0.38  0.68
V5 Word Knowledge         0.25  0.20  0.26  0.05+  0.25  0.66  0.39
E1 Elec Knowledge         0.06+  0.14  0.25  0.30  [illegible]  0.33  0.41
M2 Craft Knowledge        0.06+  0.08+  0.12*  0.20  0.13*  0.29

+ Not significant; * p < 0.05; all other correlations p < 0.01
Table 2. Table showing GTTB and NPS subtest intercorrelations
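
The significance markers in Table 2 can be checked with the usual test of a correlation against zero, t = r * sqrt((n - 2) / (1 - r^2)). The Python sketch below makes several assumptions that the paper does not state: the test is two-tailed, all 384 recruits contributed to every correlation, and the t statistic is referred to a standard normal distribution (a close approximation at this sample size). The function name `significance_label` is illustrative.

```python
import math
from statistics import NormalDist


def significance_label(r: float, n: int) -> str:
    """Two-tailed test of H0: rho = 0, using a normal approximation to t."""
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    if p < 0.01:
        return "p < 0.01"
    if p < 0.05:
        return "p < 0.05"
    return "not significant"


N_RECRUITS = 384  # sample size reported for the NPS vs GTTB study

for r in (0.05, 0.08, 0.12, 0.13, 0.14, 0.20):
    print(f"r = {r:.2f}: {significance_label(r, N_RECRUITS)}")
```

Under these assumptions the labels agree with the markers in Table 2: correlations of 0.05 to 0.08 are not significant, 0.12 and 0.13 reach the 0.05 level, and 0.14 and above reach the 0.01 level. The technician tests were taken by only a proportion of applicants, so the true sample size for those rows may be smaller.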

Once again, the correlations shown in the table range from low to high. The pattern of
correlations is much as anticipated and provides evidence for the construct validity of the
NPS battery. A factor analysis yielded a three-factor solution. The majority of the
subtests loaded onto the first factor. The two attainment-loaded technician tests formed
the second factor, and the third factor comprised the NPS literacy test and the GTTB word
knowledge test.

NPS vs RT

This study analysed data from 1,988 RN applicants who sat the NPS battery
approximately one week after sitting the RT. ABC Total and NPS Total correlated 0.59
and 0.67, respectively, with RT Total (0.70 and 0.78 when corrected for unreliability).
Subtest correlations are shown in Table 3. Again, these range from low to high, and
their pattern generally supports the construct validity of the NPS battery. A factor
analysis of the combined batteries produced a four-factor solution, of which only the
first three factors were readily interpretable. The ABC tests and the two numeracy tests
loaded onto the first factor. The two literacy tests loaded onto the second factor, and the
symbol rotation and mechanical comprehension tests loaded onto the third factor.

RT subtests (rows) against NPS subtests (columns).
NPS columns, in order: Letter Distance, Letter Checking, Number Distance, Symbol Rotation, Reasoning, Literacy, Numeracy.
Values are listed in source order; rows with fewer than seven values have cells missing from the source.

RT1 General Reasoning   0.40  0.31  0.44  0.44  [illegible]  0.59
RT2 Literacy            0.39  0.31  0.35  0.30  0.41  0.75  0.54
RT3 Numeracy            0.46  0.31  0.35  0.38  0.49  0.55  0.73
RT4 Mechanical          0.19  0.15  0.28  0.38  0.27  0.41  0.37

All  correlations  p  <  0.01 

Table 3. Table showing NPS and RT subtest intercorrelations

BARB vs NPS ABC

In this study, 353 army applicants at a Recruit Selection Centre (RSC) were administered
the NPS ABC tests, having previously sat BARB. The delay between the two test
administrations was approximately one month. A correlation between BARB GTI and ABC
Total of 0.69 was obtained (0.77 when corrected for unreliability). The intercorrelations
between the subtests are shown in Table 4. These intercorrelations range from low to
high. High correlations can be seen between the two number distance tests and between
the two symbol rotation tests. The correlations between the respective transitive
inference and reasoning tests, and between the two letter checking tests, are moderate.


NPS ABC subtests (rows) against BARB subtests (columns)

                  Alphabet  Letter    Number    Symbol    Synonyms/  Transitive
                  F/B       Checking  Distance  Rotation  Antonyms   Inference
Letter Distance   0.36      0.28      0.41      0.27      0.39       0.42
Letter Checking   0.25      0.42      0.26      0.25      0.18       0.31
Number Distance   0.23      0.23      0.71      0.23      0.33       0.41
Symbol Rotation   0.23      0.21      0.32      0.71      0.31       0.30
Reasoning         0.21      0.27      0.43      0.19      0.41       0.47

All correlations p < 0.01

Table 4. Table showing NPS ABC and BARB subtest intercorrelations
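
One simple way to read Table 4 for convergent validity is to find, for each ABC subtest, the BARB subtest with which it correlates most strongly. The Python sketch below does this using the Table 4 values read row by row. The names `BARB_SUBTESTS`, `TABLE_4` and `strongest_partner` are illustrative and not part of either battery.

```python
BARB_SUBTESTS = [
    "Alphabet F/B", "Letter Checking", "Number Distance",
    "Symbol Rotation", "Synonyms/Antonyms", "Transitive Inference",
]

# Rows: NPS ABC subtests; columns follow BARB_SUBTESTS (values from Table 4).
TABLE_4 = {
    "Letter Distance": [0.36, 0.28, 0.41, 0.27, 0.39, 0.42],
    "Letter Checking": [0.25, 0.42, 0.26, 0.25, 0.18, 0.31],
    "Number Distance": [0.23, 0.23, 0.71, 0.23, 0.33, 0.41],
    "Symbol Rotation": [0.23, 0.21, 0.32, 0.71, 0.31, 0.30],
    "Reasoning":       [0.21, 0.27, 0.43, 0.19, 0.41, 0.47],
}


def strongest_partner(row):
    """Return (BARB subtest, r) for the largest correlation in one ABC row."""
    best = max(range(len(row)), key=lambda i: row[i])
    return BARB_SUBTESTS[best], row[best]


for abc_subtest, row in TABLE_4.items():
    partner, r = strongest_partner(row)
    print(f"{abc_subtest:<16} -> {partner} (r = {r:.2f})")
```

The matched item types pair up as expected: number distance with number distance, symbol rotation with symbol rotation, letter checking with letter checking, and reasoning with transitive inference. Letter distance, which has no direct BARB counterpart, correlates most strongly with transitive inference.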

A  factor  analysis  of  the  subtests  yielded  a  two-factor  solution  with  the  two  number 
distance  tests,  the  transitive  inference  and  reasoning  tests,  the  letter  distance  and  letter 
checking  tests  loading  on  the  first  factor.  The  second  factor  was  made  up  of  the  two 
symbol  rotation  tests  and  the  two  letter  checking  tests. 

SUMMARY  OF  FINDINGS 

A summary of the composite intercorrelations from the different studies is shown in
Table 5.


             BARB GTI   ABC Total     NPS Total   GTTB GAI      RT Total
BARB GTI     -          0.69 (0.77)   ?           0.66 (0.77)   ?
ABC Total    -          -             -           0.60 (0.71)   0.59 (0.70)
NPS Total    -          -             -           0.66 (0.77)   0.67 (0.78)
GTTB GAI     -          -             -           -             ?
RT Total     -          -             -           -             -

All correlations p < 0.01; ( ) denotes correction for unreliability
Table  5.  Summary  table  showing  inter-correlations  of  the  different  composite  scores 
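
The composite correlations come from samples of very different sizes, so their precision differs. The Python sketch below computes approximate 95% confidence intervals for the uncorrected correlations using Fisher's z transformation. It assumes that each study's full reported sample contributed to the composite correlation, which the paper does not confirm. The names `fisher_ci` and `STUDIES` are illustrative.

```python
import math
from statistics import NormalDist


def fisher_ci(r: float, n: int, level: float = 0.95):
    """Approximate confidence interval for a correlation via Fisher's z."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# (composite pair, uncorrected r, sample size) as reported in the four studies.
STUDIES = [
    ("BARB GTI vs GTTB GAI", 0.66, 428),
    ("NPS Total vs GTTB GAI", 0.66, 384),
    ("ABC Total vs GTTB GAI", 0.60, 384),
    ("NPS Total vs RT Total", 0.67, 1988),
    ("ABC Total vs RT Total", 0.59, 1988),
    ("BARB GTI vs ABC Total", 0.69, 353),
]

for label, r, n in STUDIES:
    low, high = fisher_ci(r, n)
    print(f"{label:<22} r = {r:.2f}  95% CI [{low:.2f}, {high:.2f}]")
```

The intervals apply to the observed correlations only. The corrected values in parentheses carry additional uncertainty from the reliability estimates, which this sketch does not model.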

DISCUSSION 

All the current and proposed UK service selection tests were designed to measure the
construct of general intelligence. The core subtests of the BARB and NPS batteries are
based upon Carroll's three-stratum model of the intellect, whilst the RT and GTTB tests
are based around the work of Spearman. The findings of the studies reported in this
paper show considerable overlap between all the batteries and support the view that all
the batteries are measuring general intelligence, although perhaps in slightly different
ways. There would appear to be scope for the rationalisation of current service selection
tests and the introduction of a single test for tri-service use. Practical constraints placed
limitations on the collection of data, which were collected operationally as part of the
selection process. The delay between initial testing and retesting, practice effects, and
motivational effects may all have served to limit the correlations observed.

An initial surprise in the findings was the relatively low correlation between the BARB
and NPS batteries. Given that a considerable number of the subtests share a common
theoretical underpinning, higher correlations were expected. Bartram (1994) and Mead
and Drasgow (1993) give useful reviews of the equivalence of pencil-and-paper and
computerised versions of tests. Mead and Drasgow's meta-analysis found that power
tests transfer quite well across media, whereas this was often not the case for speeded
tests. Modality of presentation would appear to have had a significant impact on
testees' performance across the BARB and NPS tests.